
fix: fix gpt oss export + bump mbridge#2249

Open
yuki-97 wants to merge 4 commits into main from yukih/fix-gpt-oss

yuki-97 (Contributor) commented Apr 10, 2026

Previously, examples/converters/convert_megatron_to_hf.py exported gpt-oss models with an incorrect layout. This PR fixes that. See NVIDIA-NeMo/Megatron-Bridge#3271 for more details.

Validation steps:

  1. Import the HF checkpoint into Megatron, train for one step, and save the Megatron checkpoint.
    NRL_FORCE_REBUILD_VENVS=true \
    uv run python examples/run_grpo.py \
        --config examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml \
        grpo.max_num_steps=1 \
        policy.max_total_sequence_length=512 \
        logger.wandb_enabled=false \
        logger.tensorboard_enabled=false \
        checkpointing.enabled=True \
        checkpointing.checkpoint_dir=results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose \
        checkpointing.save_period=1
    
  2. Convert the saved Megatron checkpoint to HF format.
    uv run --extra mcore python examples/converters/convert_megatron_to_hf.py \
        --config results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose/step_1/config.yaml \
        --hf-model-name unsloth/gpt-oss-20b-BF16 \
        --megatron-ckpt-path results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose/step_1/policy/weights/iter_0000000 \
        --hf-ckpt-path results/step_1_hf
    
  3. Train again, starting from the converted HF checkpoint.
    uv run python examples/run_grpo.py \
        --config examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml \
        policy.model_name=results/step_1_hf \
        grpo.max_num_steps=1 \
        policy.max_total_sequence_length=512 \
        logger.wandb_enabled=false \
        logger.tensorboard_enabled=false \
        checkpointing.enabled=false
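
The three commands above share a handful of paths, and step 3 only makes sense if step 2 actually produced a loadable HF directory. The following is a minimal sketch of helpers that derive those paths and run a pre-flight check before step 3. The function names are hypothetical and not part of the repository, and the check assumes the converter writes a top-level `config.json`, as HF checkpoints normally do.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not repository code.

# Directory holding everything saved at a given step: <ckpt_dir>/step_<n>.
step_dir() {
    printf '%s/step_%s\n' "$1" "$2"
}

# Megatron weights path in the form used by step 2 above. The iter_0000000
# suffix is copied verbatim from that command, not derived from the step.
megatron_ckpt_path() {
    printf '%s/policy/weights/iter_0000000\n' "$(step_dir "$1" "$2")"
}

# Returns 0 if the directory looks like a loadable HF checkpoint.
# Assumption: the converter writes config.json at the top level.
looks_like_hf_ckpt() {
    [ -d "$1" ] && [ -f "$1/config.json" ]
}
```

For example, `megatron_ckpt_path results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose 1` prints the `--megatron-ckpt-path` value from step 2, and `looks_like_hf_ckpt results/step_1_hf || exit 1` could guard step 3.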
    

Results of validation step 3:
Before this PR:

  • Generation KL Error: 13.0520
  • Avg Reward: 0.0000

After this PR:

  • Generation KL Error: 0.0009
  • Avg Reward: 0.3960
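
To turn the before/after comparison into a repeatable check, the reported KL value can be compared against a tolerance. This is a rough sketch: `kl_below` is a hypothetical helper, the threshold is an illustrative choice rather than a project-defined value, and reading the number out of the training logs is left out because the log format is not shown here.

```shell
# Sketch only: kl_below is a hypothetical helper, and any threshold passed
# to it is an illustrative choice, not a value defined by the project.

# Succeeds (exit status 0) when the first number is strictly below the second.
# awk does the comparison because POSIX sh has no floating-point arithmetic.
kl_below() {
    awk -v kl="$1" -v max="$2" 'BEGIN { exit !(kl + 0 < max + 0) }'
}
```

With a tolerance of 0.01, `kl_below 0.0009 0.01` succeeds for the value reported after this PR, while `kl_below 13.0520 0.01` fails for the value reported before it.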

copy-pr-bot bot commented Apr 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

yuki-97 added 4 commits April 11, 2026 02:04
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
yuki-97 force-pushed the yukih/fix-gpt-oss branch from 6fa2609 to 41fc91d, April 11, 2026 12:59
yuki-97 marked this pull request as ready for review, April 11, 2026 13:00
yuki-97 requested review from a team as code owners, April 11, 2026 13:00
yuki-97 added the "CI:L1" label (Run doctests, unit tests, and functional tests), Apr 11, 2026
yuki-97 (Contributor, Author) commented Apr 11, 2026

/ok to test 41fc91d

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 41fc91d (PR #2249 from yukih/fix-gpt-oss)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

yuki-97 added the "r0.6.0" label, Apr 11, 2026
yuki-97 changed the title from "fix: fix gpt oss export" to "fix: fix gpt oss export + bump mbridge", Apr 11, 2026